The shikimate pathway provides carbon skeletons for the aromatic amino acids
L-tryptophan, L-phenylalanine, and L-tyrosine. It is a high-flux pathway: it has
been estimated that more than 30% of all fixed carbon is directed through it.
These combined pathways have received considerable research attention because
mammals are unable to synthesize these amino acids and because one of the enzymes
of the shikimate pathway is a very effective herbicide target. In addition, these
pathways provide precursors for a wide range of important secondary metabolites
including chlorogenic acid, alkaloids, glucosinolates, auxin, tannins, suberin,
lignin and lignan, tocopherols, and betalains. Here we review the shikimate pathway
of the green lineage and compare and contrast its evolution and ubiquity with that
of the more specialized phenylpropanoid metabolism which this essential pathway
fuels.



Keywords: shikimate pathway, aromatic amino acid biosynthesis, evolution, gene copy number, gene duplication, plant
secondary phenolic metabolite



INTRODUCTION 

The shikimate pathway is closely interlinked with the biosynthetic pathways of
the aromatic amino acids (L-tryptophan, L-phenylalanine, and L-tyrosine) and in
land plants bears very high fluxes, with estimates of the proportion of fixed
carbon passing through the pathway varying between 20 and 50% (Weiss, 1986;
Corea et al., 2012; Maeda and Dudareva, 2012). Considerable research focus has
been placed on this pathway since the aromatic amino acids are not produced by
humans and monogastric livestock and are therefore an important dietary
component (Tzin and Galili, 2010). Furthermore, one of the enzymes of the
pathway, 5-enolpyruvylshikimate-3-phosphate synthase (EPSPS), is one of the most
widely employed herbicide target sites (see Duke and Powles, 2008). Moreover, as
we have recently described, plant phenolic secondary metabolites and their
precursors are synthesized via the pathway of shikimate biosynthesis and its
numerous branchpoints (Tohge et al., 2013). The shikimate pathway is highly
conserved, being found in fungi, bacteria, and plant species, wherein it operates
in the biosynthesis not just of the three aromatic amino acids described above
but also of innumerable aromatic secondary metabolites such as alkaloids,
flavonoids, lignins, and aromatic antibiotics. Many of these compounds are
bioactive and play important roles in plant defense against biotic and abiotic
stresses and in environmental interactions (Hamberger et al., 2006; Maeda and
Dudareva, 2012), and as such are of high physiological importance. It is
estimated that under normal conditions as much as 20% of the total fixed carbon
flows through the shikimate pathway (Ni et al., 1996), with greater carbon flow
through the pathway at times of plant stress or rapid growth (Corea et al.,
2012). Given its importance it is perhaps not surprising that all of the
biosynthetic genes and corresponding enzymes of the shikimate pathway have been
characterized in model plants such as Arabidopsis. Cross-species comparison of
the shikimate biosynthetic enzymes has revealed that they share sequence
similarity, divergent evolution, and commonality in reaction mechanisms
(Dosselaere and Vanderleyden, 2001). However, fungi differ considerably from all
other species, having evolved a complex system in which a single pentafunctional
polypeptide, known as the AroM complex, performs five consecutive reactions
(Lumsden and Coggins, 1977; Duncan et al., 1987). In this review we summarize
current knowledge concerning the genetic nature of this pathway, focusing on
cross-species comparisons bridging a wide range of species including algae
(Chlamydomonas reinhardtii, Volvox carteri, Micromonas sp., Ostreococcus tauri,
Ostreococcus lucimarinus), a lycophyte and a moss (Selaginella moellendorffii and
Physcomitrella patens, respectively), monocots (Sorghum bicolor, Zea mays,
Brachypodium distachyon, Oryza sativa ssp. japonica and Oryza sativa ssp.
indica), and dicots (Vitis vinifera, Theobroma cacao, Carica papaya, Arabidopsis
thaliana, Arabidopsis lyrata, Populus trichocarpa, Ricinus communis, Manihot
esculenta, Malus domestica, Fragaria vesca, Glycine max, Lotus japonicus,
Medicago truncatula) (Table 1). Finally, we compare and contrast the evolution
of this pathway with that of the more specialized pathways of phenylpropanoid
biosynthesis.

SHIKIMATE BIOSYNTHESIS AND PHENYLALANINE DERIVED 
SECONDARY METABOLISM IN PLANTS 

Given that phenolic secondary metabolites which are derived from phenylalanine
via shikimate biosynthesis are widely distributed in plants and other
eukaryotes, genes encoding shikimate biosynthetic enzymes are generally highly
conserved in nature. Eight and two reactions are involved in shikimate and
phenylalanine biosynthesis, respectively. All members of the gene families
and the corresponding biosynthetic enzymes involved in these
Table 1 | Summary of the species used in the study.

No. | Species name | ID | Common name | Classification | Taxonomic group
1 | Chlamydomonas reinhardtii | CR | Green algae | Chlorophyte | Chlamydomonadaceae
2 | Volvox carteri | VC | Algae | Chlorophyte | Volvocaceae
3 | Micromonas sp. RCC299 | MRC | Micromonas | Chlorophyte | Prasinophyceae
4 | Ostreococcus tauri | OT | Microalgae | Prasinophyte | Prasinophyceae
5 | Ostreococcus lucimarinus | OL | Microalgae | Prasinophyte | Prasinophyceae
6 | Selaginella moellendorffii | SM | Spike moss | Lycophyte | Selaginellaceae
7 | Physcomitrella patens | PP | Moss | Bryophyte | Funariaceae
8 | Sorghum bicolor | SB | Sorghum | Monocot | Poaceae
9 | Zea mays | ZM | Corn | Monocot | Poaceae
10 | Brachypodium distachyon | BD | Purple false brome | Monocot | Poaceae
11 | Oryza sativa ssp. japonica | OS | Japonica rice | Monocot | Poaceae
12 | Oryza sativa ssp. indica | OSINDICA | Indica rice | Monocot | Poaceae
13 | Vitis vinifera | VV | Grapevine | Dicot | Vitaceae
14 | Theobroma cacao | TC | Cacao | Dicot | Malvaceae
15 | Carica papaya | CP | Papaya | Dicot | Caricaceae
16 | Arabidopsis thaliana | AT | Arabidopsis | Dicot | Brassicaceae
17 | Arabidopsis lyrata | AL | Lyrata | Dicot | Brassicaceae
18 | Populus trichocarpa | PT | Poplar | Dicot | Salicaceae
19 | Ricinus communis | RC | Castor oil plant | Dicot | Euphorbiaceae
20 | Manihot esculenta | ME | Cassava | Dicot | Euphorbiaceae
21 | Malus domestica | MD | Apple | Dicot | Rosaceae
22 | Fragaria vesca | FV | Strawberry | Dicot | Rosaceae
23 | Glycine max | GM | Soybean | Dicot | Fabaceae
24 | Lotus japonicus | LJ | Lotus | Dicot | Fabaceae
25 | Medicago truncatula | MT | Medicago | Dicot | Fabaceae

Coding gene numbers are estimated by Plaza (http://bioinformatics.psb.ugent.be/plaza/). Relationships among the species
considered are presented on the Plaza website.



pathways have been characterized in model plants such as Arabidopsis
(Figure 1A). In contrast, phenolic secondary metabolites derived from
phenylalanine display considerable species-specific distribution, with compounds
such as coumarin derivatives, monolignols, lignin, spermidine derivatives,
flavonoids, and tannins being present in specific families within the green
lineage (Figure 1B). This diversity has arisen through diverse evolutionary
strategies, for example gene duplication and cis-regulatory evolution, in order
to adapt to prevailing environmental conditions. Given their species-specific
distribution, the genes involved in plant phenolic secondary metabolism, such as
phenylalanine ammonia-lyase (PAL), polyketide synthase (PKS),
2-oxoglutarate-dependent dioxygenases (2ODDs), and UDP-glycosyltransferases
(UGTs), are frequently used as case studies of plant evolution (Tohge et al.,
2013). Despite the fact that shikimate-phenylalanine biosynthetic genes are well
conserved in all species, including algal species, orthologs of genes related to
phenolic secondary metabolism were not detected in all algal species (Table 2;
Tohge et al., 2013). This result suggests a considerably more ancient origin of
the shikimate-phenylalanine pathways. In the next sections, we discuss the
evolution of the shikimate-phenylalanine pathways, focusing on cross-species
comparisons for each gene encoding one of the constituent enzymes of either
pathway.
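
As orientation for the enzyme-by-enzyme sections of this review, the reaction order from PEP and E4P to phenylalanine can be written down as a small ordered data structure. The following Python listing is only an illustrative sketch: the names `Step`, `PATHWAY`, `is_linear_chain`, and `enzymes_up_to` are hypothetical helpers introduced here, cofactors such as ATP and NADPH are omitted, and the step boundaries follow the enzyme descriptions in the text rather than any particular reaction count.

```python
from typing import List, NamedTuple, Tuple


class Step(NamedTuple):
    enzyme: str                  # abbreviation as used in the text
    substrates: Tuple[str, ...]  # carbon substrates only; cofactors omitted
    product: str


# Order and names follow the enzyme descriptions in this review.
# DHQD and SD are listed separately although plants carry both
# activities on one bifunctional protein (DHQD/SD).
PATHWAY: List[Step] = [
    Step("DAHPS", ("PEP", "E4P"), "DAHP"),
    Step("DHQS", ("DAHP",), "3-dehydroquinate"),
    Step("DHQD", ("3-dehydroquinate",), "3-dehydroshikimate"),
    Step("SD", ("3-dehydroshikimate",), "shikimate"),
    Step("SK", ("shikimate",), "S3P"),
    Step("EPSPS", ("S3P", "PEP"), "EPSP"),
    Step("CS", ("EPSP",), "chorismate"),
    Step("CM", ("chorismate",), "prephenate"),
    Step("PAT", ("prephenate",), "arogenate"),
    Step("ADT", ("arogenate",), "L-phenylalanine"),
]


def is_linear_chain(steps: List[Step]) -> bool:
    """Return True if every step consumes the product of the previous step."""
    return all(a.product in b.substrates for a, b in zip(steps, steps[1:]))


def enzymes_up_to(steps: List[Step], metabolite: str) -> List[str]:
    """List the enzymes, in order, needed to reach `metabolite` from the start."""
    names = []
    for step in steps:
        names.append(step.enzyme)
        if step.product == metabolite:
            return names
    raise KeyError(metabolite)


if __name__ == "__main__":
    print(is_linear_chain(PATHWAY))
    print(enzymes_up_to(PATHWAY, "chorismate"))
```

Keeping the steps in a plain list makes the branch point explicit: everything up to chorismate is the shikimate pathway proper, and the last three entries are the phenylalanine branch.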



3-DEOXY-D-ARABINO-HEPTULOSONATE 7-PHOSPHATE
SYNTHASE

The first enzymatic step of the shikimate pathway is catalyzed by
3-deoxy-D-arabino-heptulosonate 7-phosphate synthase (DAHPS), which performs an
aldol condensation of phosphoenolpyruvate (PEP) and D-erythrose 4-phosphate
(E4P) to produce 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) (Figure 1).
According to their protein structure, DAHPSs can be clustered into two distinct
homology classes. The microbe-derived class I DAHPSs contain bifunctional
chorismate mutase (CM)-DAHPS domains; for that reason microbial DAHPSs, for
example those of E. coli (AroF, AroG, and AroH) and S. cerevisiae (Aro3 and
Aro4), are classified as class I DAHPSs. By contrast, class II DAHPSs were
previously thought to be present only in plant species, but have subsequently
been reported in certain microbes such as Streptomyces coelicolor, Streptomyces
rimosus, and Neurospora crassa (Bentley, 1990; Maeda and Dudareva, 2012). The
DAHPS (AroA) and CM (AroQ) activities of B. subtilis DAHPS are, however,
separated by domain truncation. Detailed sequence structure analysis of the
bacterial AroA and AroQ families, enzymatic studies with the full-length protein
and the truncated domains of AroA and AroQ of B. subtilis, and comparison with
fusion proteins of Porphyromonas gingivalis in which the AroQ domain was fused
to the C terminus of AroA, suggest that "feedback regulation" may indeed be
FIGURE 1 | The shikimate and phenylalanine derived secondary
metabolite biosynthesis in plants. (A) Shikimate biosynthesis starting from
phosphoenolpyruvate (PEP) and D-erythrose 4-phosphate is described with
characterized genes and reported intermediate metabolites. (B) Phenylalanine
derived major phenolic secondary metabolite biosynthesis in the green lineage.
Arrows indicate enzymatic reactions; circles indicate metabolites. Abbreviations:
DAHPS, 3-deoxy-D-arabino-heptulosonate 7-phosphate synthase; DQS,
3-dehydroquinate synthase; DHQD/SD, 3-dehydroquinate dehydratase/shikimate
dehydrogenase; SK, shikimate kinase; ESPS, 3-phosphoshikimate
1-carboxyvinyltransferase; CS, chorismate synthase; CM, chorismate mutase; PAT,
prephenate aminotransferase; ADT, arogenate dehydratase; PAL, phenylalanine
ammonia-lyase; C4H, cinnamate-4-hydroxylase; 4CL, 4-coumarate CoA ligase;
CAD, cinnamoyl-alcohol dehydrogenase; F5H, ferulate 5-hydroxylase; C3H,
coumarate 3-hydroxylase; ALDH, aldehyde dehydrogenase; CCR,
cinnamoyl-CoA reductase; HCT, hydroxycinnamoyl-Coenzyme A
shikimate/quinate hydroxycinnamoyltransferase; CCoAOMT,
caffeoyl-CoA 3-O-methyltransferase; CHS, chalcone synthase; CHI, chalcone
isomerase; F3H, flavanone 3-hydroxylase; F3'H, flavonoid-3'-hydroxylase;
F3GT, flavonoid-3-O-glycosyltransferase; FS, flavone synthase; FOMT,
flavonoid O-methyltransferase; FCGT, flavone-C-glycosyltransferase; FLS,
flavonol synthase; DFR, dihydroflavonol reductase; ANS, anthocyanidin
synthase; AGT, flavonoid-O-glycosyltransferase; AAT, anthocyanin
acyltransferase; BAN, oxidoreductase|dihydroflavonol reductase like; LAC,
laccase.



the evolutionary link between the two classes, which evolved from a primitive
unregulated member of class II DAHPS (Wu and Woodard, 2006). Class II plant
DAHPSs have been reported from carrot roots (Suzich et al., 1985) and potato
cell culture (Pinto et al., 1986; Herrmann and Weaver, 1999). DAHPS is encoded
by three genes in the Arabidopsis genome (AtDAHPS1, At4g39980; AtDAHPS2,
At4g33510; AtDAHPS3, At1g22410). Orthologous gene searches using the Arabidopsis
DAHPSs as queries revealed a single gene in algal species (Chlamydomonas
reinhardtii, Volvox carteri, Micromonas sp., and Ostreococcus tauri) and Lotus
japonicus but two to eight isoforms in other higher plant species (Table 2).
AtDAHPS1-type and AtDAHPS2-type genes display differential expression in
Arabidopsis thaliana, Solanum lycopersicum, and Solanum tuberosum (Maeda and
Dudareva, 2012). AtDAHPS1-type genes, which are additionally subject to redox
regulation by the ferredoxin-thioredoxin system, exhibit significant induction
by wounding and pathogen infection (Keith et al., 1991; Gorlach et al., 1995;
Maeda and Dudareva, 2012), whereas AtDAHPS2-type genes display constitutive
expression (Gorlach et al., 1995). A phylogenetic analysis of DAHPS genes
reveals four major clades: (i) a microphyte clade, (ii) a bryophyte duplication
clade, (iii) a monocot and dicot woody species clade, and (iv) an AtDAHPSs clade
(Figure 2Aa). Furthermore, major clade iv has four sub-groups: (iv-a) the
AtDAHPS2 group, (iv-b) a monocot group, (iv-c) the AtDAHPS1 group, and (iv-d)
the AtDAHPS3 group. This result indicates that the constitutively expressed
AtDAHPS1 and the stress responsive AtDAHPS3 type genes display well conserved
sequence between species (clades iv-c and iv-d), whereas the second
constitutively expressed AtDAHPS2 type genes are clearly separated between
monocot and dicot species (clade iv-a).

3-DEHYDROQUINATE SYNTHASE 

The second step of the shikimate pathway is catalyzed by 3-dehydroquinate
synthase (DHQS), an enzyme which promotes the intramolecular exchange of the
DAHP ring oxygen with carbon 7 to convert DAHP into 3-dehydroquinate. Unlike the
fungal situation detailed above, the plant DHQS gene is monofunctional and is
found as a single copy in all species with the exception of Glycine max, which
harbors two genes in its genome (Figure 2Ab). Phylogenetic analysis of DHQS
genes reveals five major clades consisting of (i) microphyte, (ii) bryophyte,
(iii) monocot, (iv) Brassicaceae, and (v) dicot species. Intriguingly, in
contrast to other shikimate biosynthetic genes, expression of the DHQS gene is
not well correlated with phenylpropanoid production in Arabidopsis (Hamberger
et al., 2006).

3-DEHYDROQUINATE DEHYDRATASE/SHIKIMATE 
DEHYDROGENASE 

3-Dehydroquinate is converted to shikimate by the bifunctional enzyme
3-dehydroquinate dehydratase/shikimate dehydrogenase (DHQD/SD), which catalyzes
first the dehydration of 3-dehydroquinate to 3-dehydroshikimate and subsequently
the reversible reduction of this intermediate to shikimate using NADPH as
co-factor. DHQD/SD exists in three forms: bacterial-specific class I shikimate
dehydrogenases (AroE type), class II shikimate/quinate dehydrogenases (YdiB
type), and class III shikimate dehydrogenase-like enzymes (SHD-l type) (Michel
et al., 2003; Singh et al., 2005). In plants (class IV), the enzymatic activity
of DHQD is 10 times higher than SD activity, indicating that the amount of
3-dehydroshikimate will be more than sufficient to support flux through the
shikimate pathway (Fiedler and Schultz, 1985). This bifunctional enzyme plays an
important role in regulating metabolism of several phenolic secondary metabolic
pathways (Bentley, 1990; Ding et al., 2007). In general, seed plants contain in
their genome a single DHQD/SD gene, which includes a sequence encoding a plastid
transit peptide (Maeda et al., 2011; Table 2).



FIGURE 2 | Panel (A) shikimate biosynthesis, (a) DAHPS. Tree annotation: (iv)
tandem gene duplication in woody species. Species key: SB, Sorghum bicolor; ZM,
Zea mays; BD, Brachypodium distachyon; OS, Oryza sativa ssp. japonica; OSINDICA,
Oryza sativa ssp. indica; VV, Vitis vinifera; TC, Theobroma cacao; CP, Carica
papaya; AT, Arabidopsis thaliana; AL, Arabidopsis lyrata; PT, Populus
trichocarpa; RC, Ricinus communis; ME, Manihot esculenta; MD, Malus domestica;
FV, Fragaria vesca; GM, Glycine max; LJ, Lotus japonicus; MT, Medicago
truncatula.
FIGURE 2 | Phylogenetic tree analysis of shikimate and phenylalanine
biosynthetic genes in 25 species. Amino acid sequence phylogenetic
trees of (A) shikimate pathway genes: (a) DAHPS, (b) DHS, (c) DHQD/SD, (d)
SK, (e) ESPS, and (f) CS; and (B) phenylalanine related genes: (a) CM and (b)
PAT. Amino acid sequences of shikimate biosynthetic genes were obtained
from the Plaza database (http://bioinformatics.psb.ugent.be/plaza/).
Relationships among the species considered are presented on the Plaza
website. The phylogenetic trees were constructed from the aligned protein
sequences with MEGA (version 5.10; http://www.megasoftware.net/;
Kumar et al., 2004) using the neighbor-joining method with the following
parameters: Poisson correction, complete deletion, and bootstrap
(1000 replicates, random seed). The protein sequences were aligned
by Plaza. Values on the branches indicate bootstrap support in
percentages.
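
The distance settings named in this legend (Poisson correction with complete deletion) can be illustrated with a short Python sketch. It assumes the standard Poisson-corrected distance d = -ln(1 - p), where p is the proportion of differing sites, and it treats complete deletion as the removal of every alignment column that contains a gap in any sequence. The toy sequences and the function names `complete_deletion` and `poisson_distance` are hypothetical; this is not MEGA code and does not reproduce the trees in Figure 2.

```python
import math
from typing import List


def complete_deletion(alignment: List[str], gap: str = "-") -> List[str]:
    """Drop every column in which at least one sequence carries a gap."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must have the same aligned length")
    keep = [i for i in range(length) if all(seq[i] != gap for seq in alignment)]
    return ["".join(seq[i] for i in keep) for seq in alignment]


def poisson_distance(seq_a: str, seq_b: str) -> float:
    """Poisson-corrected distance d = -ln(1 - p) for two gap-free sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    p = sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)
    if p >= 1.0:
        raise ValueError("distance is undefined when every site differs")
    return -math.log(1.0 - p)


if __name__ == "__main__":
    # Toy aligned fragments, invented purely for illustration.
    toy = ["MKTA-LV", "MKSAGLV", "MRTAGIV"]
    cleaned = complete_deletion(toy)
    for i in range(len(cleaned)):
        for j in range(i + 1, len(cleaned)):
            print(i, j, round(poisson_distance(cleaned[i], cleaned[j]), 4))
```

A pairwise distance matrix of this kind is the input that a neighbor-joining procedure clusters; bootstrap support is then obtained by resampling alignment columns and repeating the calculation.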



However, an exception to this statement is Nicotiana tabacum, which contains two
genes in its genome. Intriguingly, silencing of NtDHD/SHD-1 results in strong
growth inhibition and reduced levels of aromatic amino acids, chlorogenic acid,
and lignin (Ding et al., 2007); however, a second cytosolic isoform can
compensate for the production of shikimate but not at the phenotypic level. On
a more general basis, phylogenetic analysis reveals that microphytes also
contain a low number of DHQD/SD genes (between one and two), whilst clear
separation between (i) the microphyte clade, (ii) the bryophyte clade, (iii) the
monocot clade, (iv) the woody species-specific tandem gene duplication clade,
and (v) the dicot clades could be observed (Figure 2Ac; Table 2). Interestingly,
the observation of the woody species-specific tandem gene duplication clade
suggests that these species evolved after DHQD/SD gene duplication. The
cytosolic localization of NtDHD/SHD-2 is intriguing since the presence of DAHP
synthase, ESPS synthase, and CM isoforms lacking N-terminal plastid targeting
sequences has been reported (d'Amato, 1984; Mousdale and Coggins, 1985; Ganson
et al., 1986). Furthermore, the finding that both ESPS synthase and shikimate
kinase (SK) are active even when they retain their target sequences
(Dellacioppa et al., 1986; Schmid et al., 1992) suggests that they could also
potentially be constituents of a cytosolic pathway. Finally, experiments in
which isolated and highly pure mitochondria were supplied with 13C-labeled
glucose to investigate the binding of the cytosolic isoforms of glycolysis
(Giege et al., 2003) also revealed 13C enrichment in shikimate (Sweetlove and
Fernie, 2013), indicating that a full cytosolic pathway is likely also present
in this species.
SHIKIMATE KINASE

The fifth reaction of the shikimate pathway is catalyzed by SK, which performs
the ATP-dependent phosphorylation of shikimate to shikimate 3-phosphate (S3P).
E. coli has two SKs, one of class I (AroL type) and one of class II (AroK type),
which share only 30% sequence identity (Griffin and Gasson, 1995; Whipp and
Pittard, 1995; Herrmann and Weaver, 1999). In plants, the number of SK isoforms
differs between species: only one is found in green algae, lycophytes, and
bryophytes but between one and three in monocot and dicot plants (Table 2). A
phylogenetic analysis of SK genes presents five major clades consisting of (i)
microphyte, (ii) bryophyte, (iii) dicot woody species-specific, (iv) monocot,
and (v) dicot species clades (Figure 2Ad). Analysis of the SK protein of
Spinacia oleracea revealed that it is modulated by energy status and is
therefore similar to bacterial SK protein and other ATP-utilizing enzymes
(Pacold and Anderson, 1973; Huang et al., 1975; Schmidt et al., 1990). For this
reason it has recently been postulated that SK may link the energy-requiring
shikimate pathway to the cellular energy balance (Maeda and Dudareva, 2012);
however, direct experimental support for this hypothesis is currently lacking.
In Arabidopsis, homologous genes named SKL1 and SKL2, which are functionally
required for chloroplast biogenesis, have been demonstrated to have arisen from
SK gene duplication (Fucile et al., 2008). SKL1 and SKL2 orthologs have been
found in several seed plant species, but not in green algae (Table 2).

5-ENOLPYRUVYLSHIKIMATE 3-PHOSPHATE SYNTHASE

5-Enolpyruvylshikimate 3-phosphate synthase (EPSPS; 3-phosphoshikimate
1-carboxyvinyltransferase) catalyzes the sixth step, in which a second PEP is
condensed with S3P to form 5-enolpyruvylshikimate 3-phosphate (EPSP). Since
EPSPS is the only known target of the herbicide glyphosate (Steinrucken and
Amrhein, 1980), isoforms of this enzyme are often classified according to their
sensitivity to glyphosate: glyphosate-sensitive class I EPSPS is present in
bacteria and plant species, whilst glyphosate-insensitive class II EPSPS has
been reported in certain bacteria such as Agrobacterium (Fucile et al., 2011).
In plants, the number of EPSPS isoforms differs between species: only a single
isoform is found in green algae, lycophytes, and bryophytes, but either one or
two are found in monocot and dicot species (Table 2). Phylogenetic analysis of
EPSPS genes revealed, atypically for genes associated with shikimate metabolism,
five major groups: (i) microphyte, (ii) bryophyte, (iii) a Brassicaceae-specific
clade, (iv) monocot species, and (v) a dicot species clade (Figure 2Ae). There
are clear indications that the duplicated EPSPS genes in Arabidopsis, apple,
grapevine, soybean, and poplar are the result of independent duplication events
within their lineages, with both copies being maintained in Arabidopsis
(Hamberger et al., 2006); however, the reason for the unique divergence in this
gene of the pathway is currently unclear.

CHORISMATE SYNTHASE

Chorismate, the final product of the shikimate pathway, is subsequently formed
by chorismate synthase (CS), which catalyzes the trans-1,4 elimination of
phosphate from EPSP. CSs are categorized into one of two functional groups: (i)
fungal-type bifunctional CSs, which are associated with an NADPH-dependent
flavin reductase, or (ii) bacterial and plant-type monofunctional CSs (Schaller
et al., 1991; Maeda and Dudareva, 2012). The reaction catalyzed by CS requires
flavin mononucleotide (FMN), and the overall reaction is redox neutral (Ramjee
et al., 1991; Macheroux et al., 1999; Maclean and Ali, 2003). The FMN acts as an
electron donor to EPSP, which facilitates the cleavage of phosphate. The first
cloned plant CS gene was that from C. sempervirens (Schaller et al., 1991),
which contains a sole CS in its genome. Given that this gene has a 5' plastid
import signal sequence, these results indicate that there may be no CS outside
of the plastid in this species. Surveying other species revealed that one to two
CS genes are present in green algae, lycophytes, and bryophytes as well as dicot
species, but that one to three are present in the genomes of apple and
leguminous species (Table 2). A phylogenetic analysis of CS genes reveals three
major clades constituted by (i) microphyte, (ii) monocot, and (iii) dicot
species (Figure 2Af).

CHORISMATE MUTASE 

Chorismate mutase catalyzes the first step of phenylalanine and tyrosine
biosynthesis and additionally represents a key step at the branch point with
tryptophan biosynthesis. CM catalyzes the transformation of chorismate to
prephenate via a Claisen rearrangement. The bacterial minor CM proteins (AroQ
type, class I CM) display monofunctional enzymatic activity, whilst several
bifunctional CMs such as CM-PDT, CM-PDH, and CM-DAHP have additionally been
found in fungi and bacteria (class II CM; Euverink et al., 1995; Romero et al.,
1995; Chen et al., 2003; Baez-Viveros et al., 2004). Although only one CM gene
is present in algal and lycophyte genomes, more than a single gene copy (two to
five) is found in bryophytes as well as monocot and dicot species (Table 2). In
seed plants, CM1 bears a putative plastid transit peptide, whereas CM2 does not
and is additionally usually insensitive to allosteric regulation by aromatic
amino acids (Benesova and Bode, 1992; Eberhard et al., 1996; Maeda and Dudareva,
2012). Several plant species, especially dicot plants, have an additional CM3
family gene which displays high sequence similarity to CM2 yet bears a putative
plastid transit peptide. For example, Arabidopsis has three isozymes named AtCM1
(At3g29200), AtCM2 (At5g10870), and AtCM3 (At1g69370) (Mobley et al., 1999; Tzin
and Galili, 2010). Phylogenetic analysis of the CM genes reveals three major
clades: (i) an AtCM2 clade, (ii) a microphyte and bryophyte clade, and (iii) an
AtCM1 and AtCM3 clade (Figure 2Ba). Additionally, clade iii shows two
sub-groups, (iii-a) the AtCM3 sub-group and (iii-b) the AtCM1 sub-group
(Figure 2Ba) (Eberhard et al., 1996). Although the CM2 sub-group contains all
species of seed plants, monocot species are not contained in the AtCM3
sub-group. Recently the importance of CM has been extended beyond intracellular
metabolism. In Zea mays, the chorismate mutase Cmu1 secreted by Ustilago maydis,
a widespread pathogen characterized by the development of large plant tumors and
commonly known as smut, is a virulence factor. The uptake of the Ustilago Cmu1
protein by plant cells allows rerouting of plant metabolism and changes the
metabolic status of these cells via metabolic priming (Djamei et al., 2011). It
now appears that secreted CMs are found in many plant-related microbes and this
form of host manipulation would appear to be a general weapon in the arsenal
of plant pathogens.

PREPHENATE AMINOTRANSFERASE AND AROGENATE 
DEHYDRATASE 

Prephenate aminotransferase (PAT) and arogenate dehydratase (ADT) catalyze the
final steps in the production of phenylalanine. Whilst ADT was first cloned in
2007 (Cho et al., 2007; Huang et al., 2010), it is only more recently that PAT
was cloned. Papers published in 2011 identified PAT in Petunia hybrida,
Arabidopsis thaliana, and Solanum lycopersicum (Dal Cin et al., 2011; Maeda
et al., 2011) and established not only that it directs carbon flux from
prephenate to arogenate but also that it is strongly and co-ordinately
upregulated with genes of primary metabolism and phenylalanine derived flavor
volatiles. In plant species, differing numbers of PAT isoforms have been found.
Although green algae
FIGURE 3 | Heat map for isoforms of shikimate-phenylalanine
biosynthetic genes in plant genomes and hypothetical scheme
for the evolution of phenylalanine derived phenolic secondary
metabolism. (A) Heat map overview of the number of
shikimate-phenylalanine biosynthetic gene isoforms in 25 species. (B)
Hypothetical schematic figure for shikimate-phenylalanine
biosynthetic genes and their evolution of phenolic secondary
metabolism.
only contain single PAT and ADT genes, monocot species have between one and two
PATs and between two and four ADTs, whilst dicot plant genomes contain the same
number of PATs but two to eight ADTs (Table 2). Phylogenetic analysis of PAT
genes shows three major clades of (i) microphyte, (ii) monocot, and (iii) dicot
species (Figure 2Bb).
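
The copy-number ranges quoted in the text for SK, EPSPS, CM, PAT, and ADT can be collected into a small lookup to compare how far each gene family has expanded beyond the single copy seen in green algae. The Python sketch below is illustrative only: it encodes the ranges stated in the prose (Table 2 itself is not reproduced), restricts itself to three lineages, and uses hypothetical names (`COPY_RANGES`, `max_expansion`, `rank_by_expansion`).

```python
from typing import Dict, List, Tuple

Range = Tuple[int, int]  # (minimum, maximum) gene copies per genome

# Ranges as stated in the text of this review; other lineages are omitted.
COPY_RANGES: Dict[str, Dict[str, Range]] = {
    "SK":    {"green algae": (1, 1), "monocots": (1, 3), "dicots": (1, 3)},
    "EPSPS": {"green algae": (1, 1), "monocots": (1, 2), "dicots": (1, 2)},
    "CM":    {"green algae": (1, 1), "monocots": (2, 5), "dicots": (2, 5)},
    "PAT":   {"green algae": (1, 1), "monocots": (1, 2), "dicots": (1, 2)},
    "ADT":   {"green algae": (1, 1), "monocots": (2, 4), "dicots": (2, 8)},
}


def max_expansion(gene: str) -> int:
    """Largest monocot or dicot copy number minus the largest algal one."""
    ranges = COPY_RANGES[gene]
    land_max = max(ranges["monocots"][1], ranges["dicots"][1])
    return land_max - ranges["green algae"][1]


def rank_by_expansion() -> List[str]:
    """Genes ordered from most to least expanded; ties keep listing order."""
    return sorted(COPY_RANGES, key=max_expansion, reverse=True)


if __name__ == "__main__":
    for gene in rank_by_expansion():
        print(gene, max_expansion(gene))
```

On these ranges ADT and CM show the largest expansion, which is in line with the genes singled out for prominent duplication in the conclusion.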

GENES INVOLVED IN PLANT PHENOLIC SECONDARY 
METABOLISMS 

Phenolic secondary metabolism displays an immense chemical diversity due to the
evolution of the enzyme-encoding genes which are involved in the various
biosynthetic and decorative pathways. Such variation is caused by the diversity
and redundancy of several key genes of phenolic secondary metabolism such as
PKSs, cytochrome P450s (CYPs), Fe2+/2-oxoglutarate-dependent dioxygenases
(2ODDs), and UDP-glycosyltransferases (UGTs). On the other hand, there are other
general phenylpropanoid-related biosynthetic genes, phenylalanine ammonia-lyase
(PAL), cinnamate 4-hydroxylase (C4H), and 4-coumarate:coenzyme A ligase (4CL),
which are required in order to differentiate various classes of phenolic
secondary metabolism. All of these core genes encode important enzymes which
activate a number of hydroxycinnamic acids to provide precursors for the
biosynthesis of lignins, monolignols, and indeed all other major phenolic
secondary metabolites in higher plants (Lozoya et al., 1988; Allina et al.,
1998; Hu et al., 1998; Ehlting et al., 1999; Lindermayr et al., 2002; Hamberger
and Hahlbrock, 2004). Since phenolic secondary metabolism displays considerable
species-specificity, the genes encoding the responsible biosynthetic enzymes are
frequently used as an example of chemotaxonomy for understanding plant
evolution. However, considering the evolution of these genes in isolation is
rather restrictive; a deeper understanding is provided by combining this with
investigation of the evolution of the shikimate-phenylalanine biosynthetic genes
in the green lineage.

CONCLUSION 

During the long evolutionary period from aquatic algae to land plants, plants
have adapted to their environmental niches through evolutionary strategies such
as gene duplication and convergent evolution under the filter of natural
selection. Genes of plant shikimate biosynthesis have evolved accordingly
(Figure 3). In this review, we demonstrated that biosynthetic genes of aromatic
amino acid primary metabolism are well conserved between algae and all land
plants. However, in contrast to algal species, which have neither isoforms nor
duplicated genes in their genomes, all land plants harbor gene duplications,
including tandem gene duplications, which are particularly prominent in the
cases of DAHPS, DHQD/SD, CS, CM, and ADT (Figure 3A; Table 2). Our phylogenetic
analysis revealed clear separation between algae, monocots, dicots, woody
species, and leguminous plants. Analysis of the presence and copy number of key
genes across these species gives several hints as to how to improve our
understanding of the scaffold from which these genes have evolved. However, the
exact evolutionary pressures on genes of shikimate biosynthesis, including the
unique occurrence of the AroM complex, will require considerable further study.
That said, it is intriguing to compare and contrast the shikimate biosynthetic
genes with those downstream of them in the production of plant phenolics
(Figure 3B). Interestingly, shikimate pathway genes are ubiquitous across the
green lineage whilst this cannot be said for all downstream genes of
phenylpropanoid biosynthesis. Furthermore, there is much greater gene
duplication within phenylpropanoid than shikimate biosynthesis (Figure 3A;
Table 2). This fact is also reflected in the level of chemical diversity of the
respective pathways, with the essentiality of the shikimate pathway preventing
much diversity, but phenylpropanoid species often being redundant in function to
one another. It would seem likely that the phenylpropanoid pathway initially
arose via mutations accumulating in the shikimate pathway genes. However, whilst
these were potentially beneficial in land plants for reasons we discuss in our
recent review of these compounds (Tohge et al., 2013), they do not appear to
share the essentiality of shikimate across the entire green lineage.